Abstract

The specific recognition of antigen by T cells is critical to the generation of adaptive immune responses in vertebrates. T cells
recognize antigen using a somatically diversified T-cell receptor (TCR). All jawed vertebrates use four TCR chains called α, β, γ,
and δ, which are expressed as either an αβ or a γδ heterodimer. Nonplacental mammals (monotremes and marsupials) are unusual
in that their genomes encode a fifth TCR chain, called TCRμ, whose function is not known but which is also somatically diversified like
the conventional chains. The origins of TCRμ are also unclear, although it appears distantly related to TCRδ. Recent analysis of
avian and amphibian genomes has provided insight into a model for understanding the evolution of the TCRδ genes in tetrapods
that was not evident from humans, mice, or other commonly studied placental (eutherian) mammals. An analysis of the genes
encoding the TCRδ chains in the duckbill platypus revealed the presence of a highly divergent variable (V) gene, indistinguishable
from immunoglobulin heavy (IgH) chain V genes (VH) and related to V genes used in TCRμ. This V gene is expressed as part of the TCRδ
repertoire (VHδ), similar to what has been found in frogs and birds. This, however, is the first time a VHδ has been found in a
mammal and provides a critical link in reconstructing the evolutionary history of TCRμ. The current structure of TCRδ and
TCRμ genes in tetrapods suggests ancient and possibly recurring translocations of gene segments between the IgH and TCRδ
genes, as well as translocations of TCRδ genes out of the TCRα/δ locus early in mammals, creating the TCRμ locus.
Introduction

T lymphocytes are critical to the adaptive immune system of all jawed vertebrates and can be
classified into two main lineages based on the T-cell receptor (TCR) they use (Rast et al. 1997;
reviewed in Davis and Chien 2008). The majority of circulating human T cells belong to the αβ T cell
lineage, which uses a TCR composed of a heterodimer of α and β TCR chains. αβ T cells include the
familiar T cell subsets such as CD4+ helper T cells and regulatory T cells, CD8+ cytotoxic T cells,
and natural killer T (NKT) cells. T cells that are found primarily in epithelial tissues, and that
make up a lower percentage of circulating lymphocytes in some species, express a TCR composed of γ and δ
TCR chains. The function of these γδ T cells is less well defined, and they have been associated with
a broad range of immune responses including tumor surveillance, innate responses to pathogens and
stress, and wound healing (Hayday 2009). αβ and γδ T cells also differ in the way they interact
with antigen. αβTCR are major histocompatibility complex (MHC) "restricted" in that they bind
antigenic epitopes, such as peptide fragments, bound to, or "presented" by, molecules encoded in the
MHC. In contrast, γδTCR have been found to bind antigens directly in the absence of MHC, as
well as self-ligands that are often MHC-related molecules (Sciammas et al. 1994; Hayday 2009).

The conventional TCR chains are composed of two extracellular domains that are both members of the
immunoglobulin (Ig) domain super-family (reviewed in Davis and Chien 2008) (fig. 1). The membrane
proximal domain is the constant (C) domain, which is largely invariant amongst T-cell clones
expressing the same class of TCR chain, and is usually encoded by a single, intact exon. The membrane
distal domain is called the variable (V) domain and is the region of the TCR that contacts antigen
and MHC. Similar to antibodies, the individual clonal diversity in the TCR V domains is
Fig. 1. Cartoon diagram of the TCR forms found in different species.
Oblong circles indicate Ig super-family domains and are color coded as C
domains (blue), conventional TCR V domains (red), and VHδ or Vμ
(yellow). The gray shaded chains represent the hypothetical partner
chain for TCRμ and TCRδ using VHδ.



© The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and
reproduction in any medium, provided the original work is properly cited.






Mol. Biol. Evol. 29(10)3205-3214. 2012 doi:10.1093/molbev/mss128 Advance Access publication May 15, 2012 






generated by somatic DNA recombination (Tonegawa 1983). The exons encoding TCR V domains are
assembled somatically from germ-line gene segments, called the V, diversity (D), and joining (J)
genes, in developing T cells, a process dependent upon the enzymes encoded by the recombination
activating genes (RAG)-1 and RAG-2 (Yancopoulos et al. 1986; Schatz et al. 1989). The exons encoding
the V domains of TCR β and δ chains are assembled from all three types of gene segments, whereas the
α and γ chains use only V and J. The different combinations of V, D, and J or V and J, selected from
a large repertoire of germ-line gene segments, along with variation at the junctions due to addition
and deletion of nucleotides during recombination, contribute to a vast TCR diversity. It is this
diversity that creates the individual antigen specificity of T-cell clones.

The TCR genes are highly conserved among species in both genomic sequence and organization (Rast
et al. 1997; Parra et al. 2008, 2012; Chen et al. 2009). In all tetrapods examined, the TCRβ and γ
chains are each encoded at separate loci, whereas the genes encoding the α and δ chains are nested at
a single locus (TCRα/δ) (Chien et al. 1987; Satyanarayana et al. 1988; reviewed in Davis and Chien
2008). The V domains of TCRα and TCRδ chains can use a common pool of V gene segments, but distinct
D, J, and C genes.

Diversity in antibodies produced by B cells is also generated by RAG-mediated V(D)J recombination,
and the TCR and Ig genes clearly share a common origin in the jawed vertebrates (Flajnik and Kasahara
2010; Litman et al. 2010). However, the V, D, J, and C coding regions in TCR have diverged
sufficiently from Ig genes over the past >400 million years (MY) that they are readily
distinguishable, at least for the conventional TCR. Recently, the boundary between TCR and Ig genes
has been blurred by the discovery of non-conventional TCRδ isoforms that use V genes that appear
indistinguishable from Ig heavy chain V (VH) (Parra et al. 2010, 2012). Such V genes have been
designated as VHδ and have been found in both amphibians and birds (fig. 1). In the frog Xenopus
tropicalis and a passerine bird, the zebra finch Taeniopygia guttata, the VHδ are located within the
TCRα/δ loci, where they co-exist with conventional Vα and Vδ genes (Parra et al. 2010, 2012). In
galliform birds, such as the chicken Gallus gallus, VHδ are present but located at a second TCRδ
locus that is unlinked to the conventional TCRα/δ (Parra et al. 2012). VHδ are the only type of V
gene segment present at the second locus and, although closely related to antibody VH genes, the VHδ
appear to be used exclusively in TCRδ chains. This is true as well for frogs, where the TCRα/δ and
IgH loci are tightly linked (Parra et al. 2010).

The TCRα/δ loci have been characterized in several eutherian mammal species and at least one
marsupial, the opossum Monodelphis domestica, and VHδ genes have not been found to date
(Satyanarayana et al. 1988; Wang et al. 1994; Parra et al. 2008). However, marsupials do have an
additional TCR locus, unlinked to TCRα/δ, that uses antibody-related V genes. This fifth TCR chain is
called TCRμ and is related to TCRδ, although it is highly divergent in sequence and structure (Parra
et al. 2007, 2008). A TCRμ has also been found in the duckbill platypus and is clearly orthologous to
the marsupial genes, consistent with this TCR chain being ancient in mammals, although it has been
lost in the eutherians (Parra et al. 2008; Wang et al. 2011). TCRμ chains use their own unique set of
V genes (Vμ) (Parra et al. 2007; Wang et al. 2011). Trans-locus V(D)J recombination of V genes from
other Ig and TCR loci with TCRμ genes has not been found. So far, TCRμ homologues have not been found
in non-mammals (Parra et al. 2008).

TCRμ chains are atypical in that they contain three extracellular IgSF domains rather than the
conventional two, due to an extra N-terminal V domain (fig. 1) (Parra et al. 2007; Wang et al. 2011).
Both V domains are encoded by a unique set of Vμ genes and are more related to Ig VH than to
conventional TCR V domains. The N-terminal V domain is diverse and encoded by genes that undergo
somatic V(D)J recombination. The second or supporting V domain has little or no diversity. In
marsupials this V domain is encoded by a germ-line joined, or pre-assembled, V exon that is invariant
(Parra et al. 2007). The second V domain in platypus is encoded by gene segments requiring somatic
DNA recombination; however, only limited diversity is generated, partly due to the lack of D segments
(Wang et al. 2011). A TCR chain structurally similar to TCRμ has also been described in sharks and
other cartilaginous fish (fig. 1) (Criscitiello et al. 2006; Flajnik et al. 2011). This TCR, called
NAR-TCR, also contains three extracellular domains, with the N-terminal V domain being related to
those used by IgNAR antibodies, a type of antibody found only in sharks (Greenberg et al. 1995). The
current working model for both TCRμ and NAR-TCR is that the N-terminal V domain is unpaired and
acts as a single, antigen binding domain, analogous to the V domains of light-chainless antibodies
found in sharks and camelids (Flajnik et al. 2011; Wang et al. 2011).

Phylogenetic analyses support the origins of TCRμ occurring after the avian-mammalian split (Parra
et al. 2007; Wang et al. 2011). Previously, we hypothesized that TCRμ originated from a
recombination between ancestral IgH and TCRδ-like loci (Parra et al. 2008). This hypothesis, however,
is problematic for a number of reasons. One challenge is the apparent genomic stability and ancient
conserved synteny in the region surrounding the TCRα/δ locus; this region appears to have remained
stable over at least the past 350 MY of tetrapod evolution (Parra et al. 2008, 2010). The discovery
of VHδ genes inserted into the TCRα/δ locus of amphibians and birds has provided an alternative model
for the origins of TCRμ; this model involves the insertion of VH followed by the duplication and
translocation of TCR genes. Here we present the model along with supporting evidence drawn from the
structure of the platypus TCRα/δ locus, which is also the first analysis of this complex locus in a
monotreme.

Materials and Methods 

Identification and Annotation of the Platypus
TCRα/δ Locus

The analyses were performed using the platypus (Ornithorhynchus anatinus) genome assembly version
5.0.1 (http://www.ncbi.nlm.nih.gov/genome/guide/platypus/). The platypus genome was analyzed using
the whole-genome BLAST available at NCBI (www.ncbi.nlm.nih.gov/) and the BLAST/BLAT tool from Ensembl
(www.ensembl.org). The V and J segments were located by similarity to corresponding segments from
other species and by identifying the flanking conserved recombination signal sequences (RSS). V gene
segments were annotated 5' to 3' as Vα or Vδ followed by the family number and, if there was more
than one gene in the family, the gene segment number. For example, Vα15.7 is the seventh Vα gene in
family 15. The D segments were identified using complementarity-determining region-3 (CDR3) sequences,
which represent the V-D-J junctions, from cDNA clones using VHδ. Platypus TCR gene segments were
labeled according to the IMGT nomenclature (http://www.imgt.org/). The location of the TCRα/δ genes
in the platypus genome version 5.0.1 is provided in supplementary table S1,
Supplementary Material online.
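
As an illustration of the RSS-based step described above, the following minimal Python sketch scans a sequence for a heptamer, a 12 or 23 bp spacer, and a nonamer. It is not the procedure used in this study: it assumes exact matches to the commonly cited consensus heptamer (CACAGTG) and nonamer (ACAAAAACC), whereas real RSS tolerate considerable variation, and it searches one strand and one orientation only. The function name `find_rss` is hypothetical.

```python
import re

# Commonly cited consensus RSS elements. Exact matching is a simplifying
# assumption; genuine RSS frequently deviate from the consensus.
HEPTAMER = "CACAGTG"
NONAMER = "ACAAAAACC"


def find_rss(seq, spacers=(12, 23)):
    """Return sorted (start, spacer_length) pairs for every
    heptamer-spacer-nonamer match on the given strand."""
    seq = seq.upper()
    hits = []
    for spacer in spacers:
        # A lookahead is used so that overlapping candidates are reported.
        pattern = re.compile(
            f"(?={HEPTAMER}[ACGT]{{{spacer}}}{NONAMER})"
        )
        hits.extend((m.start(), spacer) for m in pattern.finditer(seq))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence (not platypus data): one RSS with a 12 bp spacer.
    toy = "GG" + HEPTAMER + "A" * 12 + NONAMER + "TT"
    print(find_rss(toy))
```

In practice a position weight matrix or a mismatch-tolerant search would be more appropriate, and the reverse complement would also need to be scanned to capture RSS in the opposite orientation.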

Confirmation of Expression of Platypus VHδ
Reverse transcription PCR (RT-PCR) was performed on total splenic RNA extracted from a male platypus
from the Upper Barnard River, New South Wales, Australia. This platypus was collected under the same
permits as in Warren et al. (2008). The cDNA synthesis step was carried out using the Invitrogen
SuperScript III First-Strand Synthesis kit according to the manufacturer's recommended protocol
(Invitrogen, Carlsbad, CA, USA). TCRδ transcripts containing VHδ were targeted using primers specific
for the Cδ and VHδ genes identified in the platypus genome assembly (Warren et al. 2008). PCR
amplification was performed using the QIAGEN HotStar HiFidelity Polymerase Kit (BD Biosciences,
CLONTECH Laboratories, Palo Alto, CA, USA) in a total volume of 20 µl containing 1× HotStar HiFi PCR
Buffer (containing 0.3 mM dNTPs), 1 µM of primers, and 1.25 U HotStar HiFidelity DNA polymerase. The
PCR primers used were 5'-GTACCGCCAACCACCAGGGAAAG-3' and 5'-CAGTTCACTGCTCCATCGCTTTCA-3' for the
VHδ and Cδ, respectively. A previously described platypus spleen cDNA library constructed from RNA
extracted from tissue from a Tasmanian animal was also used (Vernersson et al. 2002).

PCR products were cloned using the TopoTA cloning® kit (Invitrogen). Sequencing was performed using
the BigDye terminator cycle sequencing kit version 3 (Applied Biosystems, Foster City, CA, USA)
according to the manufacturer's recommendations. Sequencing reactions were analyzed using the ABI
Prism 3100 DNA automated sequencer (Perkin Elmer Life and Analytical Sciences, Wellesley, MA, USA).
Chromatograms were analyzed using the Sequencher 4.9 software (Gene Codes Corporation, Ann Arbor, MI,
USA). Sequences have been archived in GenBank under accession numbers JQ664690-JQ664710.

Phylogenetic Analyses 

Nucleotide sequences from FR1 to FR3 of the V gene regions, including CDR1 and CDR2, were aligned
using BioEdit (Hall 1999) and the accessory application ClustalX (Thompson et al. 1997). The
nucleotide alignments analyzed were based on the amino acid sequence to establish codon position
(Hall 1999). Alignments were corrected by visual inspection when necessary and were then analyzed
using the MEGA software (Kumar et al. 2004). Neighbor joining (NJ) with uncorrected nucleotide
differences (p-distance) and minimum evolution distance methods were used. Support for the generated
trees was evaluated based on bootstrap values generated by 1000 replicates. GenBank accession numbers
for sequences used in the tree construction are in supplementary table S2,
Supplementary Material online.
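
The uncorrected p-distance used as the input to the NJ analysis is simply the proportion of compared sites at which two aligned sequences differ. The sketch below is a minimal Python illustration, not the MEGA implementation; it assumes pre-aligned sequences of equal length and skips any column in which either sequence carries a gap (pairwise deletion), which may differ from the gap treatment used in the original analysis. The function names are hypothetical.

```python
def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are ignored (pairwise
    deletion; an assumption, not necessarily the setting used in MEGA).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def distance_matrix(alignment):
    """Pairwise p-distances for a dict mapping name -> aligned sequence."""
    names = list(alignment)
    return {
        (m, n): p_distance(alignment[m], alignment[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


if __name__ == "__main__":
    # Toy alignment (not real V gene data).
    toy = {"seqA": "ACGTAC", "seqB": "ACGTTC", "seqC": "AC-TAC"}
    for pair, d in distance_matrix(toy).items():
        print(pair, round(d, 3))
```

Percent nucleotide identity, as reported in the Results, corresponds to 100 × (1 − p-distance) under the same gap treatment.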

Results and Discussion 

The TCRα/δ locus was identified in the current platypus genome assembly and the V, D, J, and C gene
segments and exons were annotated and characterized (fig. 2). The majority of the locus was present
on a single scaffold, with the remainder on a shorter contig (fig. 2). Flanking the locus were SALL2,
DAD1 and several olfactory receptor (OR) genes, all of which share conserved synteny with the TCRα/δ
locus in amphibians, birds, and mammals (Parra et al. 2008, 2010, 2012). The platypus locus has many
features typical of TCRα/δ loci in other tetrapods (Satyanarayana et al. 1988; Wang et al. 1994;
Parra et al. 2008, 2010, 2012). Two C region genes were present: a Cα that is the most 3' coding
segment in the locus, and a Cδ oriented 5' of the Jα genes. There is a large number of Jα gene
segments (n = 32) located between the Cδ and Cα genes. Such a large array of Jα genes is believed to
facilitate secondary Vα to Jα rearrangements in developing αβ T cells if the primary rearrangements
are nonproductive or need replacement (Hawwari and Krangel 2007). Primary TCRα V-J rearrangements
generally use Jα segments towards the 5'-end of the array and can progressively use downstream Jα in
subsequent rearrangements. There is also a single Vδ gene in reverse transcriptional orientation
between the platypus Cδ gene and the Jα array that is conserved in mammalian TCRα/δ both in location
and orientation (Parra et al. 2008).

There are 99 conventional TCR V gene segments in the platypus TCRα/δ locus, 89 of which share
nucleotide identity with Vα in other species and 10 of which share identity with Vδ genes. The Vδ
genes are clustered towards the 3'-end of the locus. Using the criterion that members of a V family
share >80% nucleotide identity, the platypus V genes can be classified into 17 different Vα families
and two different Vδ families (not shown, but annotated in fig. 2). This is also a typical level of
complexity for mammalian Vα and Vδ genes (Giudicelli et al. 2005; Parra et al. 2008). Also present
were two Dδ and seven Jδ gene segments oriented upstream of the Cδ. All gene segments were flanked by
canonical RSS, which are the recognition substrate of the RAG recombinase. The D segments were
asymmetrically flanked by an RSS containing a 12 bp spacer on the 5'-side and a 23 bp spacer on the
3'-side, as has been shown previously for TCR D gene segments in other species (Carroll et al. 1993;
Parra et al. 2007, 2010). In summary, the overall content and organization of the platypus
TCRα/δ locus appeared fairly generic.
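
The >80% nucleotide identity criterion for assigning V genes to families can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the procedure used here: it takes pre-aligned sequences, ignores gapped columns, and uses single-linkage grouping (a gene joins a family if it exceeds the threshold with any existing member), which the text does not specify. The function names are hypothetical.

```python
def identity(a, b, gap="-"):
    """Fraction of identical sites between two aligned sequences,
    ignoring columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != gap and y != gap]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def group_families(seqs, threshold=0.80):
    """Single-linkage grouping of sequences sharing > threshold identity.

    seqs maps gene name -> aligned sequence. Returns a sorted list of
    sorted name lists, one per family.
    """
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if identity(seqs[a], seqs[b]) > threshold:
                parent[find(a)] = find(b)

    families = {}
    for n in names:
        families.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in families.values())


if __name__ == "__main__":
    # Toy sequences (not platypus V genes): v1 and v2 are 90% identical.
    toy = {"v1": "AAAAAAAAAA", "v2": "AAAAAAAAAT", "v3": "CCCCCCCCCC"}
    print(group_families(toy))
```

With single linkage, two genes that are less than 80% identical to each other can still fall into one family through an intermediate member; a complete-linkage rule would be stricter.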
Fig. 2. Annotated map of the platypus TCRα/δ locus showing the locations of the Vα and Vδ (red), VHδ (yellow), Dδ (orange), Jα and Jδ (green), Cδ
(dark blue), and Cα (light blue). Conserved syntenic genes are in gray. The scaffold and contig numbers are indicated.



What was atypical in the platypus TCRα/δ locus was the presence of an additional V gene that shared
greater identity with antibody VH genes than with TCR V genes (figs. 2 and 3). This V gene segment was
the most proximal of the V genes to the D and J genes and was tentatively designated as VHδ. VHδ are,
by definition, V genes indistinguishable from Ig VH genes but used in encoding TCRδ chains and have
previously been found only in the genomes of birds and frogs (Parra et al. 2008, 2010, 2012).

VH genes from mammals and other tetrapods have been shown to cluster into three ancient clans, and
individual species differ in the presence of one or more of these clans in their germ-line IgH locus
(Tutter and Riblet 1989; Ota and Nei 1994). For example, humans, mice, echidnas, and frogs have
VH genes from all three clans (Schwager et al. 1989; Ota and Nei 1994; Belov and Hellman 2003),
whereas rabbits, opossums, and chickens have only a single clan (McCormack et al. 1991; Butler 1997;
Johansson et al. 2002; Baker et al. 2005). In phylogenetic analyses, the platypus VHδ was most related
to the platypus Vμ genes found in the TCRμ locus in this species (fig. 3). Platypus VHδ, however,
shares only 51-61% nucleotide identity (average 56.6%) with the platypus Vμ genes. Both the platypus
Vμ and VHδ clustered within clan III (fig. 3) (Wang et al. 2011). This is noteworthy given that
VH genes in the platypus IgH locus are also clan III and, in general, clan III VH are the most
ubiquitous and conserved lineage of VH (Johansson et al. 2002; Tutter and Riblet 1989).
Although clearly related to platypus VH, the VHδ gene shares only 34-65% nucleotide identity (average
56.9%) with the bona fide VH used in antibody heavy chains in this species.

It was necessary to rule out the possibility that the VHδ gene present in the platypus TCRα/δ locus
was an artifact of the genome assembly process. One piece of supporting evidence would be the
demonstration that the VHδ is recombined to downstream Dδ and Jδ segments and expressed with Cδ in
complete TCRδ transcripts. PCR using primers specific for VHδ and Cδ was performed on cDNA synthesized
from splenic RNA from two different platypuses, one from New South Wales (NSW) and the other from
Tasmania. PCR products were successfully amplified from the NSW animal and these were cloned and
sequenced. Twenty clones, each containing a unique nucleotide sequence, were characterized and found
to contain the VHδ recombined to the Dδ and Jδ gene segments (fig. 4A). Of these 20, 11 had unique V,
D, and J combinations that would encode 11 different complementarity-determining regions-3 (CDR3)
(fig. 4B). More than half of the CDR3 (8 out of 11) contained evidence of using both D genes (VDDJ)
(fig. 4B). This is a common feature of TCRδ V domains, where multiple D genes can be incorporated into
the recombination due to the presence of asymmetrical RSS (Carroll et al. 1993). The regions
corresponding to the junctions between the V, D, and J segments contained additional sequence that
could not be accounted for by the germ-line gene segments (fig. 4B). There are two possible sources of
such sequence. One is palindromic (P) nucleotides, which are created during V(D)J recombination when
the RAG generates hairpin structures that are resolved asymmetrically during the re-ligation process
(Lewis 1994). The second is non-templated (N) nucleotides, which can be added by the enzyme terminal
deoxynucleotidyl transferase (TdT) during the V(D)J recombination process. An unusual feature of the
platypus VHδ is the presence of a second cysteine encoded near the 3'-end of the gene, directly next
to the cysteine



Fig. 3. Phylogenetic tree of mammalian VH genes including the platypus VHδ and monotreme Vμ. The
three major VH clans are bracketed. The platypus VHδ is boxed and the clade containing platypus VHδ
along with platypus and echidna Vμ is in bold and indicated by a smaller bracket in VH clan III. The
three-digit numbers following the VH gene labels are the last three digits of the GenBank accession
number referenced in supplementary table S2, Supplementary Material online. The numbers following the
platypus and echidna Vμ labels are clone numbers. The tree presented was generated using the Minimum
Evolution method. Similar topology was generated using the Neighbor Joining method.
Fig. 4. (A) Alignment of predicted protein sequence of transcripts containing a recombined VHδ gene isolated from platypus spleen RNA. The
individual clones are identified by the last three digits of their GenBank accession numbers (JQ664690-JQ664710). Shown is the region from FR3 of
the VHδ through the beginning of the Cδ domain. The sequence in bold at the top of the alignment is the germ-line VHδ and Cδ gene sequence. The
double cysteines at the end of FR3 and unpaired cysteines in CDR3 are shaded, as is the canonical FGXG in FR4. (B) Nucleotide sequence of the CDR3
region of the eleven unique V(D)J recombinants using VHδ described in the text. The germ-line sequences of the 3'-end of VHδ and the two Dδ are shown at
the top. The germ-line Jδ sequences are shown on the right-hand side of the alignment interspersed amongst the cDNA sequences using each.
Nucleotides in the junctions between the V, D, and J segments, shown italicized, are most likely N-nucleotides added by TdT.



predicted to form the intra-domain disulfide bond in Ig domains (fig. 4A). Additional cysteines in
the CDR3 region of VH domains have been thought to provide stability to unusually long CDR3 loops, as
has been described for cattle and the platypus previously (Johansson et al. 2002). The CDR3 of
TCRδ using VHδ are only slightly longer than those of conventional TCRδ chains (ranging 10-20
residues) (Rock et al. 1994; Wang et al. 2011). Furthermore, the stabilization of CDR3 generally
involves multiple pairs of cysteines, which were not present in the platypus VHδ clones (fig. 4A).
Attempts to amplify TCRδ transcripts containing VHδ from splenic RNA obtained from the Tasmanian
animal were unsuccessful. As a positive control, however, TCRδ transcripts containing conventional
Vα/δ were successfully isolated. It is possible that Tasmanian platypuses, which have been separated
from the mainland population for at least 14,000 years, either have a divergent VHδ or have deleted
this single V gene altogether (Lambeck and Chappell 2001).

Although there is only a single VHδ in the current platypus genome assembly, there was sequence
variation in the region corresponding to FR1 through FR3 of the V domains (fig. 4A and sequence data
not shown but available in GenBank). Some of this variation could represent two alleles of a single
VHδ gene. Indeed, the RNA used in this experiment is from a wild-caught individual from the same
population that was used to generate the whole-genome sequence, which was found to contain
substantial heterozygosity (Warren et al. 2008). There was greater variation in the transcribed
sequences, however, than could be explained simply by two alleles of a single gene (fig. 4A). Two
alternative explanations are the occurrence of somatic mutation of expressed VHδ genes or allelic
variation in gene copy number. Somatic mutation in TCR chains is controversial. Nonetheless, it has
been invoked to explain the variation in expressed TCR chains that exceeds the apparent gene copy
number in sharks, and has also been postulated to occur in salmonids (Yazawa et al. 2008; Chen et al.
2009). Therefore, it does not seem to be out of the realm of possibility that somatic mutation is
occurring in platypus VHδ. Indeed, the mutations appear to be localized to the V region with no
variation in the C region (fig. 4A). This may be due to the relatedness of VHδ to Ig VH genes, where
somatic hyper-mutation is well documented. Such somatic mutation contributes to overall affinity
maturation in secondary antibody responses (Wysocki et al. 1986). The pattern of mutation seen in
platypus VHδ, however, is not localized to the CDR3, which would be indicative of selection for
affinity maturation, but was also found in the framework regions. Furthermore, in the avian genomes
where there is also only a single VHδ, there was no evidence of somatic mutation in the V regions
(Parra et al. 2012). The contribution of mutation to the platypus TCRδ repertoire, if it is occurring,
remains to be determined. Alternatively, the sequence polymorphism may be due to VHδ gene copy number
variation between individual TCRα/δ alleles.

Irrespective of the number of VHδ genes in the platypus TCRα/δ locus, the results clearly support
TCRδ transcripts containing VHδ recombined to Dδ and Jδ gene segments in the TCRα/δ locus (fig. 4). A
VHδ gene or genes in the platypus TCRα/δ locus in the genome assembly, therefore, does not appear to
be an assembly artifact. Rather it is present, functional and contributes to the expressed TCRδ chain
repertoire. The possibility that some platypus TCRα/δ loci contain more than a single VHδ does not
alter the principal conclusions of this study.

Previously, we hypothesized that the origin of TCRμ in mammals involved recombination between an
ancestral TCRα/δ locus and an IgH locus (Parra et al. 2008). The IgH locus would have contributed the
V gene segments at the 5'-end of the locus, with the TCRδ contributing the D, J, and C genes at the
3'-end of the locus. The difficulty with this hypothesis was the clear stability of the genome region
surrounding the TCRα/δ locus. In other words, the chromosomal region containing the TCRα/δ locus
appears to have remained relatively undisrupted for at least the past 360 million years (Parra et al.
2008, 2010, 2012). The discovery of VHδ genes within the TCRα/δ loci of frog and zebra finch is
consistent with insertions occurring without apparently disrupting the local syntenic region. In
frogs, the IgH and TCRα/δ loci are tightly linked, which may have facilitated the translocation of
VH genes into the TCRα/δ locus (Parra et al. 2010). However, close linkage is not a requirement: the
lack of similarity between the VHδ in frogs, birds, and monotremes indicates that the translocation
of VH genes also occurred independently in birds and monotremes (Parra et al. 2012). Indeed, it would
appear as if the acquisition of VH genes into the TCRα/δ locus occurred independently in each
lineage.

The similarity between the platypus VHδ and V genes in the TCRμ locus is, so far, the clearest
evolutionary association between the TCRμ and TCRδ loci in one species. From the comparison of the
TCRα/δ loci in frogs, birds, and monotremes, a model for the evolution of TCRμ and other TCRδ forms
emerges (fig. 5), which can be summarized as follows:

1) Early in the evolution of tetrapods, or earlier, a duplication of the D-J-Cδ cluster occurred,
resulting in the presence of two Cδ each with its own set of Dδ and Jδ segments (fig. 5A).

2) Subsequently, a VH gene or genes was translocated from the IgH locus and inserted into the TCRα/δ
locus, most likely to a location between the existing Vα/Vδ genes and the 5'-proximal D-J-Cδ
cluster (fig. 5B). This resulted in a configuration like that which currently exists in the
zebra finch genome (Parra et al. 2012).

3) In the amphibian lineage there was an inversion of the region containing the VHδ-Dδ-Jδ-Cδ cluster
and an expansion in the number of VHδ genes (fig. 5C). Currently, X. tropicalis has the greatest
number of VHδ genes, where they make up the majority of V genes available in the germ-line for use
in TCRδ chains (Parra et al. 2010).

4) In the galliform lineage (chicken and turkey), the VHδ-Dδ-Jδ-Cδ cluster was translocated out of
the TCRα/δ locus and currently resides on another chromosome (fig. 5D). There are no Vα or Vδ genes
at the site of the second chicken TCRδ locus and only a single Cδ gene remains in the conventional
TCRα/δ locus (Parra et al. 2012).

5) Similar to galliform birds, the VHδ-Dδ-Jδ-Cδ cluster was translocated out of the TCRα/δ locus,
presumably in the last common ancestor of mammals, giving rise to TCRμ (fig. 5E). Internal
duplications of the VHδ-Dδ-Jδ genes gave rise to the current [(V-D-J) - (V-D-J) - C] organization
necessary to encode TCR chains with double V domains (Parra et al. 2007; Wang et al. 2011). In the
platypus, the second V-D-J cluster, encoding the supporting V, has lost its D segments and generates
V domains with short CDR3 encoded by direct V to J recombination (Wang et al. 2011). The whole
cluster appears to have undergone additional tandem duplication as it exists in multiple tandem
copies in the opossum and also likely in the platypus (Parra et al. 2007, 2008; Wang et al. 2011).

6) In the therian lineage (marsupials and placentals), the VHδ was lost from the TCRα/δ locus (Parra
et al. 2008). In placental mammals, the TCRμ locus was also lost (Parra et al. 2008). The marsupials
retained TCRμ; however, the second set of V and J segments, encoding the supporting V domain in the
protein chain, was replaced with a germ-line joined V gene, in a process most likely involving
germ-line V(D)J recombination and retro-transposition (fig. 5F) (Parra et al. 2007, 2008).

TCR forms such as TCRμ, which contain three 
extracellular domains, have evolved at least twice in 
vertebrates. The first was in the ancestors of the cartilaginous fish 
in the form of NAR-TCR (Criscitiello et al. 2006) and the 
second in the mammals as TCRμ (Parra et al. 2007). 
NAR-TCR uses an N-terminal V domain related to the V 
domains found in IgNAR antibodies, which are unique to 
cartilaginous fish (Greenberg et al. 1995; Criscitiello et al. 
2006), and not closely related to antibody VH domains. 
Therefore, it appears that NAR-TCR and TCRμ are more 
likely the result of convergent evolution than of 
direct descent (Parra et al. 2007; Wang et al. 
2011). Similarly, the model proposed in fig. 5 posits the 
direct transfer of VH genes from an IgH locus to the 
TCRα/δ locus. It should be pointed out, however, that the VHδ 
found in frogs, birds, and monotremes are not closely related 
(fig. 3); indeed, each appears to be derived from a different, 
ancient VH clan (birds, VH clan I; frogs, VH clan II; platypus, 
VH clan III). This observation would suggest that the transfer 
of VHδ into the TCRα/δ loci occurred independently in the 
different lineages. Alternatively, the transfer of VH genes into 
the TCRα/δ locus may have occurred frequently and 
repeatedly in the past, and gene replacement is the best 
explanation for the current content of these genes in the different 
tetrapod lineages. The absence of VHδ in marsupials, the 
highly divergent nature of Vμ genes in this lineage, and 
the absence of conserved synteny with genes linked to 
TCRμ in the opossum provide little insight into the origins 
of TCRμ and its relationship to TCRδ or the other 
conventional TCR (Parra et al. 2008). The similarity between VH, 
VHδ, and Vμ genes in the platypus genome, which are all 
clan III, however, is striking. In particular, the close 
relationship between the platypus VHδ and Vμ genes lends greater 
support to the model presented in fig. 5E, with TCRμ 
having been derived from TCRδ genes. 
Fig. 5. A model of the stages of evolution of the TCRα/δ loci in tetrapods and the origins of TCRμ in mammals. A color key of the gene segments is 
presented at the bottom. (A) Depiction of the Dδ-Jδ-Cδ duplication in an ancestral TCRα/δ locus that provides a second Cδ gene found in frogs and 
zebra finch. (B) Depiction of the insertion of a VH gene into the TCRα/δ locus producing a current organization as it is found in zebra finch. 
(C) Depiction of the inversion/translocation and VHδ gene duplication that yielded the current organization found in frogs. (D) Depiction of the 
translocation of a VHδ-Dδ-Jδ-Cδ cluster to a location outside the TCRα/δ locus generating a second TCRδ locus as it is currently found in chicken and 
turkey. (E) Depiction of the translocation that took place in mammals giving rise to the TCRμ locus. (F) Loss of TCRμ in placental mammals, loss of D gene 
segments in the cluster encoding the supporting V domain, retrotransposition to form a germ-line joined V in marsupials, and duplication of TCRμ clusters in 
both monotremes and marsupials. 

TCR chains that use antibody-like V domains, such as 
TCRδ using VHδ, NAR-TCR, or TCRμ, are 
widely distributed in vertebrates, with only the bony fish 
and placental mammals lacking them. In addition to NAR-TCR, 
some shark species also appear to generate TCR chains 
using antibody V genes. This occurs via trans-locus V(D)J 
recombination between IgM and IgW heavy chain V genes 
and TCRδ and TCRα D and J genes (Criscitiello et al. 2010). 
This may be possible, in part, due to the multiple clusters of Ig 
genes found in the cartilaginous fish. It also illustrates that 
there have been independent solutions to generating TCR 
chains with antibody V domains in different vertebrate 
lineages. In the tetrapods, the VH genes were translocated into 
the TCR loci, where they became part of the germ-line 
repertoire, whereas in cartilaginous fish something equivalent may 
occur somatically during V(D)J recombination in developing 
T cells. Either mechanism suggests there has been selection 
for having TCR using antibody V genes over much of 
vertebrate evolutionary history. 

The current working hypothesis for such chains is that they 
are able to bind native antigen directly. This is consistent with 
a selective pressure for TCR chains that may bind or recognize 
antigen in ways similar to antibodies in many different 
lineages of vertebrates. In the case of NAR-TCR and TCRμ, the 
N-terminal V domain is likely to be unpaired and bind antigen 
as a single domain (fig. 1), as has been described for IgNAR 
and some IgG antibodies in camels (recently reviewed in 
Flajnik et al. 2011). This model of antigen binding is consistent 
with the evidence that the N-terminal V domains in TCRμ 
are somatically diverse, while the second, supporting V 
domains have limited diversity, with the latter presumably 
performing a structural role rather than one of antigen 
recognition (Parra et al. 2007; Wang et al. 2011). There is 
no evidence of double V domains in TCRδ chains using 
VHδ in frogs, birds, or platypus (fig. 1) (Parra et al. 2010, 
2012). Rather, the TCR complex containing VHδ would 
likely be structured similarly to a conventional γδTCR with a 
single V domain on each chain. It is possible that such 
receptors also bind antigen directly; however, this remains to be 
determined. 

A compelling model for the evolution of the Ig and TCR 
loci has been one of internal duplication, divergence and de- 
letion; the so-called birth-and-death model of evolution of 
immune genes promoted by Nei and colleagues (Ota and Nei 
1994; Nei et al. 1997). Our results in no way contradict the 
view that the birth-and-death mode of gene evolution has played a 
significant role in shaping these complex loci. However, our 
results do support a role for horizontal transfer of gene 
segments between the loci, which has not been previously 
appreciated. Through this mechanism, T cells may have 
acquired the ability to recognize native, rather than processed, 
antigen, much like B cells. 

Supplementary Material 

Supplementary tables S1 and S2 are available at Molecular 
Biology and Evolution (http://www.mbe.oxfordjournals.org/). 